An Effective Software Clone Detection Using Distance Clustering
Authors
D. Gayathri Devi (1), Dr. M. Punithavalli (2)
(1) Assistant Professor, Department of Computer Science; Research Scholar, Karpagam University, Coimbatore, Tamilnadu, India
(2) Director and Head, Sri Ramakrishna Engineering College, Coimbatore, Tamilnadu, India

Abstract
As computing evolves rapidly, there is a tremendous need for software developed for different purposes. The complexity of software development varies, and developers often take the easier route of copying code fragments, which leads to code clones. This paper presents a technique for detecting code clones using fragment distance with clustering. First, we tokenize the source code. Second, using distance and clustering, we compute similarity until all clusters are merged. Third, we evaluate and find the cloned code fragments using distance clustering (DC). Finally, we provide examples using distance.

Keywords: Metrics, Edit Distance, Detection, Code Clone, Fragment
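The steps named in the abstract (tokenize, measure distance between fragments, merge clusters until one remains) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer regex, the length-normalized Levenshtein distance, the single-linkage merge rule, the threshold cut, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import re
from itertools import combinations

# Assumed tokenizer: identifiers, integers, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Split a code fragment into a list of tokens."""
    return TOKEN_RE.findall(source)


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer sequence (assumed metric)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def linkage(c1, c2, dist):
    """Single-linkage distance: closest pair of members across two clusters."""
    return min(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)


def cluster_fragments(fragments):
    """Merge the two closest clusters repeatedly until one cluster remains.

    Returns the merge history as (distance, merged_cluster) pairs.
    """
    tokens = [tokenize(f) for f in fragments]
    dist = {(i, j): normalized_distance(tokens[i], tokens[j])
            for i, j in combinations(range(len(tokens)), 2)}
    clusters = [frozenset([i]) for i in range(len(tokens))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage(pair[0], pair[1], dist))
        d = linkage(a, b, dist)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((d, a | b))
    return merges


def clone_groups(merges, threshold):
    """Largest clusters that were formed at a distance <= threshold."""
    groups = []
    for d, members in merges:
        if d <= threshold:
            groups = [g for g in groups if not g <= members] + [members]
    return groups


# Hypothetical fragments: the first two differ only in identifier names.
example_fragments = [
    "int sum = a + b;",
    "int total = x + y;",
    "while (i < n) { i++; }",
]

if __name__ == "__main__":
    history = cluster_fragments(example_fragments)
    print(clone_groups(history, threshold=0.5))
```

Recording the full merge history and cutting it at a threshold afterwards keeps the "merge until all clusters are merged" step separate from the decision of which clusters count as clones, so different thresholds can be tried without recomputing distances.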
Similar resources
Clustering gene expression data using random forest dissimilarity
Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) in comparison with the number of samples (patients). Many clustering methods have been built based on the dissimilarity among observations that is calculated by a distance function. As increa...
Epm–rt–2011-01 Levenshtein Edit Distance-based Type III Clone Detection Using Metric Trees
This paper presents an original technique for clone detection with metric trees using Levenshtein distance as the metric defined between two code fragments. This approach achieves a faster empirical performance. The resulting clones may be found with varying thresholds allowing type 3 clone detection. Experimental results of metric trees performance as well as clone detection statistics on an o...
Efficiently Measuring an Accurate and Generalized Clone Detection Precision using Clone Clustering
An important measure of clone detection performance is precision. However, there has been a marked lack of research into methods of efficiently and accurately measuring the precision of a clone detection tool. Instead, tool authors simply validate a small random sample of the clones their tools detected in a subject software system. Since there could be many thousands of clones reported by the ...
Supervised Deep Features for Software Functional Clone Detection by Exploiting Lexical and Syntactical Information in Source Code
Software clone detection, which aims to identify code fragments with similar functionalities, has played an important role in software maintenance and evolution. Many clone detection approaches have been proposed. However, most of them represent source code with hand-crafted features using lexical or syntactical information, or unsupervised deep features, which makes it difficult to detect ...
An Optimization K-Modes Clustering Algorithm with Elephant Herding Optimization Algorithm for Crime Clustering
The detection and prevention of crime, in the past few decades, required several years of research and analysis. However, today, thanks to smart systems based on data mining techniques, it is possible to detect and prevent crime in considerably less time. Classification and clustering-based smart techniques can classify and cluster the crime-related samples. The most important factor in the c...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013